1 Initial Corpus generation & inspection

1.1 Corpus Creation

  1. Scopus download of the documents retrieved with the search string from Markard et al. (2012), limited to LANGUAGE = ENGLISH AND TYPE = (ARTICLE).
  2. Selection of “seed” publications: the 1% most cited per year, followed by ex-post manual exclusion. This results in 53 seed papers.
  3. Retrieval, for each seed, of the 1000 publications with the most shared references, with the same limitations as in step 1.
  4. Additional ex-post filtering. First, based on citations received and connectivity in the bibliographic coupling network: I excluded edges in the bottom 10% of the weight distribution (Jaccard weighted), as well as unconnected nodes and nodes in the bottom 10% of the degree distribution. Lastly, after the community detection exercise, I excluded nodes in communities of fewer than 500 members.
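
The network filtering in step 4 can be illustrated with a minimal Python sketch. It is not the code used to build the corpus: the function names, the nearest-rank quantile rule, and the single filtering pass are assumptions made for illustration, and the community labels are taken as given rather than computed.

```python
from collections import Counter
from itertools import combinations


def jaccard_coupling(refs):
    """Bibliographic coupling: {(a, b): |shared refs| / |union of refs|}."""
    edges = {}
    for a, b in combinations(sorted(refs), 2):
        shared = refs[a] & refs[b]
        if shared:  # papers without shared references are not linked
            edges[(a, b)] = len(shared) / len(refs[a] | refs[b])
    return edges


def cutoff(values, q):
    """Nearest-rank value at quantile q (assumed rule, not the original one)."""
    ordered = sorted(values)
    return ordered[int(q * (len(ordered) - 1))]


def filter_network(edges, q=0.10):
    """Drop edges below the q-quantile of weights, then nodes below the
    q-quantile of degree. Unconnected papers never enter the result."""
    if not edges:
        return {}, set()
    w_min = cutoff(edges.values(), q)
    kept = {pair: w for pair, w in edges.items() if w >= w_min}
    degree = Counter(node for pair in kept for node in pair)
    d_min = cutoff(degree.values(), q)
    nodes = {node for node, d in degree.items() if d >= d_min}
    # Single pass only: degrees are not recomputed after node removal.
    kept = {p: w for p, w in kept.items() if p[0] in nodes and p[1] in nodes}
    return kept, nodes


def drop_small_communities(membership, min_size=500):
    """Keep nodes whose community has at least min_size members."""
    sizes = Counter(membership.values())
    return {node for node, com in membership.items() if sizes[com] >= min_size}
```

On a toy input, `filter_network(jaccard_coupling(refs), q=0.5)` keeps only the more strongly coupled papers. On the real corpus, `q=0.10` and `min_size=500` would correspond to the thresholds named above.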

That leads to an overall corpus size of:

## Number of unique publications in the final corpus:  12716

1.2 Seed Paper

In the following, we investigate the seed papers in more detail. Note the additional tabs for details on our selection of seed papers.
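
As a rough illustration of the selection rule (the 1% most cited publications per year), the following Python sketch ranks records within each publication year and keeps the top share. The record fields `id`, `year`, and `citations`, the rounding up, and the minimum of one paper per year are assumptions. The manual ex-post exclusion is not modelled.

```python
import math
from collections import defaultdict


def top_cited_per_year(records, share=0.01):
    """Return the most cited `share` of records within each publication year."""
    by_year = defaultdict(list)
    for rec in records:
        by_year[rec["year"]].append(rec)

    seeds = []
    for year in sorted(by_year):
        ranked = sorted(by_year[year], key=lambda r: r["citations"], reverse=True)
        # Assumption: round up and keep at least one candidate per year.
        k = max(1, math.ceil(share * len(ranked)))
        seeds.extend(ranked[:k])
    return seeds
```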

1.2.1 Seed papers and corpus size

Generally, 50 x 1000 = 50,000 documents are downloaded. However, because the publications with the most shared references overlap across seed papers, the final corpus is smaller.
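
The effect of this overlap can be sketched in Python by comparing the number of downloaded records with the number of unique publication IDs. The function name and the input layout (a mapping from seed to the set of retrieved IDs) are assumptions for illustration.

```python
def corpus_size(retrieved_per_seed):
    """Return (downloaded records, unique publications) across all seeds."""
    downloaded = sum(len(ids) for ids in retrieved_per_seed.values())
    # A publication retrieved for several seeds is counted only once.
    unique = set().union(*retrieved_per_seed.values())
    return downloaded, len(unique)
```

The gap between the two numbers is the overlap between the reference neighbourhoods of the seeds.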

First insight: The main sustainability corpus appears saturated; expansion occurs mostly in adjacent fields.

1.2.2 List of all seed papers

NEEDS UPDATE

1.3 Publications

Note: The following tables refer to the documents in the main corpus. Generally, dgr refers to the degree and n to the number of publications. The suffix .f indicates that the number is fractionalized (divided by the number of elements per publication).
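
To make the fractionalized counts concrete, here is a minimal Python sketch of whole versus fractional counting. It assumes that "elements" are, for example, the authors or countries listed on a publication. The function name and input layout are hypothetical.

```python
from collections import defaultdict


def whole_and_fractional_counts(publications):
    """Return (n, n_f): whole and fractionalized publication counts per element."""
    n = defaultdict(int)
    n_f = defaultdict(float)
    for elements in publications:
        unique = set(elements)  # count each element once per publication
        for element in unique:
            n[element] += 1
            # Each publication contributes a total weight of 1, split evenly.
            n_f[element] += 1 / len(unique)
    return dict(n), dict(n_f)
```

For instance, a publication with four authors adds 1 to each author's n but only 0.25 to each author's n.f.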